Model Selection Framework for Graph-based data

Authors

  • Rajmonda S. Caceres
  • Leah Weiner
  • Matthew C. Schmidt
  • Benjamin A. Miller
  • William M. Campbell

Abstract

Graphs are powerful abstractions for capturing complex relationships in diverse application settings. An active area of research focuses on theoretical models that define the generative mechanism of a graph. Yet given the complexity and inherent noise in real datasets, it is still very challenging to identify the best model for a given observed graph. We discuss a framework for graph model selection that leverages a long list of graph topological properties and a random forest classifier to learn and classify different graph instances. We fully characterize the discriminative power of our approach as we sweep through the parameter space of two generative models, the Erdős-Rényi model and the stochastic block model. We show that our approach gets very close to known theoretical bounds, and we provide insight into which topological features play a critical discriminating role.
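
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the restriction to two equal-size blocks, the parameter values, and the three features shown (edge density, transitivity, degree variance) are all assumptions chosen for illustration, since the abstract does not list the properties actually used. The sketch relies on the fact that an Erdős-Rényi graph is the special case of a stochastic block model whose within-block and between-block edge probabilities are equal.

```python
import random
from itertools import combinations
from statistics import pvariance


def sample_two_block_graph(n, p_in, p_out, rng):
    """Sample an undirected two-block stochastic block model graph.

    Hypothetical helper: nodes alternate between block 0 and block 1.
    Setting p_in == p_out gives an Erdos-Renyi graph.
    Returns an adjacency dict mapping node -> set of neighbours.
    """
    block = [i % 2 for i in range(n)]
    adj = {i: set() for i in range(n)}
    for u, v in combinations(range(n), 2):
        p = p_in if block[u] == block[v] else p_out
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def topological_features(adj):
    """Compute a few illustrative topological features of a graph."""
    n = len(adj)
    degrees = [len(neighbours) for neighbours in adj.values()]
    edge_count = sum(degrees) / 2
    density = edge_count / (n * (n - 1) / 2)

    # Transitivity = (3 * triangles) / (connected triples).
    # Summing closed neighbour pairs over every centre node counts
    # each triangle three times, which is exactly the numerator.
    closed = 0
    triples = 0
    for neighbours in adj.values():
        k = len(neighbours)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    transitivity = closed / triples if triples else 0.0

    return {
        "density": density,
        "transitivity": transitivity,
        "degree_variance": pvariance(degrees),
    }


# Assumed parameters: both graphs have a similar expected density,
# so density alone should not separate them.
rng = random.Random(7)
er_graph = sample_two_block_graph(60, 0.20, 0.20, rng)
sbm_graph = sample_two_block_graph(60, 0.35, 0.05, rng)
for label, graph in (("ER", er_graph), ("SBM", sbm_graph)):
    print(label, topological_features(graph))
```

In a pipeline of the kind the abstract describes, each feature dictionary would become one labeled row of a training set, and a random forest classifier would then be fit on many such rows sampled across the parameter space of the two models.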

Similar resources

A new framework for high-technology project evaluation and project portfolio selection based on Pythagorean fuzzy WASPAS, MOORA and mathematical modeling

High-technology projects are known as tools that help achieve productive forces through scientific and technological knowledge. These knowledge-based projects are associated with high levels of risk and return. The process of high-technology project and project portfolio selection involves technical complexities and uncertainties. This paper presents a novel two-part method of high-technology ...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to handle the nonstationarity and uncertainty of the signals effectively should be used for ECG analysis. Ex...

Evaluation and selection of sustainable suppliers in supply chain using new GP-DEA model with imprecise data

Nowadays, with respect to knowledge growth about enterprise sustainability, sustainable supplier selection is considered a vital factor in sustainable supply chain management. On the other hand, usually in real problems, the data are imprecise. One method that is helpful for the evaluation and selection of the sustainable supplier and has the ability to use a variety of data types is data envel...

A suitable data model for HIV infection and epidemic detection

Background: In recent years, there has been an increase in the amount and variety of data generated in the field of healthcare (e.g., data related to the prevalence of contagious diseases in society). The various patterns of relationships among individuals in society make network analysis a complex and highly important process in detecting and preventing the incidence of diseases....

A New Framework for Distributed Multivariate Feature Selection

Feature selection is considered an important issue in the classification domain. Selecting good features by maximizing relevance to the class label and minimizing redundancy among features improves classification accuracy. However, most current feature selection algorithms work only in a centralized setting. In this paper, we suggest a distributed version of the mRMR featu...

An Integrated Decision Making Model for Manufacturing Cell Formation and Supplier Selection

Optimization of the complete manufacturing and supply process has become a critical ingredient for gaining a competitive advantage. This article provides a unified mathematical framework for modeling manufacturing cell configuration and raw material supplier selection in a two-level supply chain network. The commonly used manufacturing design parameters along with supplier selection and a subco...

Journal:
  • CoRR

Volume: abs/1609.04859

Publication date: 2016